The success of machine learning has been fueled by the increasing availability of computing power and large training datasets. Training data is used to learn new models or to update existing ones, under the assumption that it sufficiently represents the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field over the last 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research on poisoning and shed light on the current limitations and open research questions in this field.
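To make the threat concrete, here is a minimal sketch of the simplest poisoning strategy covered by this taxonomy, label flipping, on a toy scikit-learn task; the attack budget and the model are illustrative only, not drawn from the survey.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy setup: a clean binary classification task.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def flip_labels(y, rate, rng):
    """Label-flipping poisoning: invert the labels of a random
    fraction of the training set (binary 0/1 labels assumed)."""
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    return y_poisoned

rng = np.random.default_rng(0)
for rate in [0.0, 0.1, 0.3]:
    clf = LogisticRegression(max_iter=1000).fit(X_tr, flip_labels(y_tr, rate, rng))
    print(f"poison rate {rate:.0%}: test accuracy {clf.score(X_te, y_te):.3f}")
```

Targeted and clean-label attacks discussed in this literature replace the random flip with carefully optimized perturbations, but the threat model, corrupting a fraction of the training set, is the same.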
With the proliferation of AI-enabled software systems in smart manufacturing, the role of such systems is shifting from a reactive to a proactive one that provides context-specific support to manufacturing operators. In the frame of the EU-funded Teaming.AI project, we identified the monitoring of teaming aspects in human-AI collaboration, the runtime monitoring and validation of ethical policies, and the support for experimentation with data and machine learning algorithms as the most relevant challenges for human-AI teaming in smart manufacturing. Based on these challenges, we developed a reference software architecture based on knowledge graphs, tracking and scene analysis, and components for relational machine learning, with a particular focus on its scalability. Our approach uses knowledge graphs to capture product- and process-specific knowledge in the manufacturing process and to utilize it for relational machine learning. This allows for context-specific recommendations for actions in the manufacturing process to optimize product quality and prevent physical harm. The empirical validation of this software architecture will be conducted in cooperation with three large-scale companies in the automotive, energy systems, and precision machining domains. In this paper we discuss the identified challenges for such a reference software architecture, present its preliminary status, and sketch our further research vision in this project.
A fuzzy-theoretic analytical approach was recently introduced that leads to efficient and robust models while automatically addressing the typical issues associated with parametric deep models. However, a formal conceptualization of analytical fuzzy-theoretic deep models has still been missing. Using measure-theoretic foundations, this paper introduces the concept of \emph{membership mappings} for representing data points through attribute values (motivated by fuzzy theory). The properties of membership mappings can be exploited for data representation learning, as a membership mapping provides an interpolation on the given data points in the data space. An analytical approach to membership-mapping-based data representation models is considered.
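The paper's measure-theoretic definition is not reproduced here; as a purely schematic illustration of representing a data point through attribute (membership) values, consider a Gaussian fuzzy membership function:

\[
\mu_{a,\sigma}(x) \;=\; \exp\!\left(-\frac{\lVert x - a\rVert^2}{2\sigma^2}\right),
\qquad
x \;\mapsto\; \big(\mu_{a_1,\sigma}(x), \ldots, \mu_{a_n,\sigma}(x)\big) \in [0,1]^n,
\]

so that each point $x$ is described by its degrees of membership to reference points $a_1, \ldots, a_n$; the interpolation property mentioned above additionally requires the learned mapping to agree with the given samples exactly at those points.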
Explainability has become a central requirement for the development, deployment, and adoption of machine learning (ML) models, and we have yet to understand what explanation methods can and cannot do. Several factors, such as the data, the model prediction, the hyperparameters used in training the model, and the random initialization, can all influence downstream explanations. While previous work empirically hinted that explanations (E) may have little relationship with the prediction (Y), conclusive studies quantifying this relationship have been lacking. Our work borrows tools from causal inference to systematically assay this relationship. More specifically, we measure the relationship between E and Y by measuring the treatment effect when intervening on their causal ancestors, i.e., the hyperparameters and the inputs used to generate saliency-based Es or Ys. We discover that Y's relative direct influence on E follows an odd pattern: the influence is higher in the lowest-performing models than in mid-performing models, and it then decreases in the top-performing models. We believe our work is a promising first step towards providing better guidance for practitioners, who can make more informed decisions in utilizing these explanations by knowing what factors are at play and how they relate to their end task.
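A toy version of such an intervention can be sketched as follows: train two models that differ only in one causal ancestor (here the learning rate; the paper intervenes on a range of hyperparameters), then compare how much the saliency-based explanations E and the predictions Y shift. The setup below is illustrative, not the paper's protocol.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)
y = (X[:, 0] > 0).long()  # toy task driven entirely by feature 0

def train(lr):
    """Train a small classifier; lr is the intervened-on ancestor."""
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(200):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()
    return model

def saliency(model, x):
    """Gradient-based saliency: |d logit / d input| for the predicted class."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits[torch.arange(len(x)), logits.argmax(1)].sum().backward()
    return x.grad.abs()

x_probe = X[:32]
base, treated = train(0.1), train(0.01)  # intervention: change one hyperparameter
dE = (saliency(base, x_probe) - saliency(treated, x_probe)).norm().item()
with torch.no_grad():
    dY = (base(x_probe).softmax(1) - treated(x_probe).softmax(1)).norm().item()
print(f"shift in explanations: {dE:.3f}; shift in predictions: {dY:.3f}")
```

Repeating this across many interventions and comparing the two shifts is, in spirit, how a treatment effect on E versus Y would be estimated.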
In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for a bidirectional long short-term memory equalizer coupled with a convolutional NN (biLSTM + CNN), a CNN equalizer, and standard 1-StpS digital back-propagation (DBP), for the simulated and experimental propagation of a single-channel dual-polarization (SC-DP) 16QAM signal at 34 GBd along 17x70 km of LEAF. The biLSTM+CNN equalizer provides a result similar to DBP and a 1.7 dB Q-factor gain over the chromatic dispersion compensation baseline on the experimental dataset. After that, we assess the Q-factor and the impact on hardware utilization when approximating the activation functions of the NNs using Taylor series, piecewise-linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of a hardware implementation achieving 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
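As a software-level illustration of the LUT variant, the sketch below samples tanh on a fixed grid, evaluates it by nearest-entry lookup (a stand-in for what the FPGA tables would store), and reports the approximation error; the table size and input range are illustrative, not the values used on the FPGA.

```python
import numpy as np

def make_tanh_lut(n_entries=64, x_min=-4.0, x_max=4.0):
    """Sample tanh on a uniform grid; the table is what hardware stores."""
    grid = np.linspace(x_min, x_max, n_entries)
    return grid, np.tanh(grid)

def lut_tanh(x, grid, table):
    """Clamp to the table range, then return the nearest stored sample."""
    x = np.clip(x, grid[0], grid[-1])
    idx = np.round((x - grid[0]) / (grid[1] - grid[0])).astype(int)
    return table[idx]

grid, table = make_tanh_lut()
x = np.linspace(-5, 5, 10_000)
err = np.abs(np.tanh(x) - lut_tanh(x, grid, table))
print(f"64-entry LUT: max abs error {err.max():.4f}, mean {err.mean():.5f}")
```

Replacing the nearest-entry lookup with linear interpolation between adjacent entries (e.g., np.interp) gives the piecewise-linear variant, trading extra arithmetic for a smaller table.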
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications. However, collaboration between these actors is difficult due to the heterogeneous nature of geospatial data modalities (e.g., multi-spectral images of various resolutions, time series, weather data) and the diversity of tasks (e.g., regressing human activity indicators or detecting forest fires). In this work, we present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias, pre-trained on large amounts of unlabelled earth observation data in a self-supervised manner. We envision how such a model may facilitate cooperation between members of the community. We show preliminary results on the first step of the roadmap, where we instantiate an architecture that can process a wide variety of geospatial data modalities and demonstrate that it can achieve competitive performance with domain-specific architectures on tasks relating to the U.N.'s Sustainable Development Goals.
ParaDime is a framework for parametric dimensionality reduction (DR). In parametric DR, neural networks are trained to embed high-dimensional data items in a low-dimensional space while minimizing an objective function. ParaDime builds on the idea that the objective functions of several modern DR techniques result from transformed inter-item relationships. It provides a common interface to specify these relations and transformations and to define how they are used within the losses that govern the training process. Through this interface, ParaDime unifies parametric versions of DR techniques such as metric MDS, t-SNE, and UMAP. Furthermore, it allows users to fully customize each aspect of the DR process. We show how this ease of customization makes ParaDime suitable for experimenting with interesting techniques, such as hybrid classification/embedding models or supervised DR, which opens up new possibilities for visualizing high-dimensional data.
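The underlying recipe can be pictured with a generic parametric-DR training loop: a network embeds items in 2-D while the loss compares transformed inter-item relationships, here plain pairwise distances, i.e., a metric-MDS-style stress. This is a schematic PyTorch illustration, not ParaDime's actual interface.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(500, 30)                      # high-dimensional data items
encoder = nn.Sequential(nn.Linear(30, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

for step in range(1000):
    idx = torch.randint(0, len(X), (128,))    # mini-batch of items
    batch = X[idx]
    d_high = torch.cdist(batch, batch)        # inter-item relations in data space
    d_low = torch.cdist(encoder(batch), encoder(batch))
    loss = ((d_high - d_low) ** 2).mean()     # stress-style comparison of relations
    opt.zero_grad(); loss.backward(); opt.step()

embedding = encoder(X).detach()               # parametric: reusable on unseen items
```

Swapping the distance transform and loss for neighbor-affinity and divergence terms would move this loop toward parametric t-SNE or UMAP, which is exactly the family of variations the framework's common interface is meant to express.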
Online continual learning (OCL) aims to train neural networks incrementally from a non-stationary data stream with a single pass through the data. Rehearsal-based methods attempt to approximate the observed input distribution with a small memory and revisit those samples later to avoid forgetting. Despite their strong empirical performance, rehearsal methods still suffer from a discrepancy between the loss landscape of past data and that of the memory samples. This paper revisits the rehearsal dynamics in the online setting. We provide theoretical insights into the inherent memory-overfitting risk from the perspective of biased and dynamic empirical risk minimization, and examine the merits and limits of repeated rehearsal. Inspired by our analysis, a simple and intuitive baseline, Repeated Augmented Rehearsal (RAR), is designed to address the underfitting-overfitting dilemma of online rehearsal. Surprisingly, across four rather different OCL benchmarks, this simple baseline outperforms vanilla rehearsal by 9%-17% and significantly improves state-of-the-art rehearsal-based methods such as MIR, ASER, and SCR. We also demonstrate that RAR successfully achieves an accurate approximation of the loss landscape of past data and high-loss ridge aversion in its learning trajectory. Extensive ablation studies are conducted to study the interplay between repeated and augmented rehearsal, and reinforcement learning (RL) is applied to dynamically adjust the hyperparameters of RAR to balance the stability-plasticity trade-off online.
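Schematically, the RAR recipe amounts to replaying each incoming batch together with memory samples several times, with a fresh random augmentation on every repetition, so the memory is revisited repeatedly without being memorized verbatim. The sketch below uses a hypothetical FIFO buffer and placeholder repetition count and augmentation, not the paper's configuration.

```python
import torch

class Buffer:
    """Minimal FIFO rehearsal memory (stand-in for e.g. reservoir sampling)."""
    def __init__(self, capacity=200):
        self.capacity, self.x, self.y = capacity, [], []
    def update(self, x, y):
        self.x.extend(x); self.y.extend(y)
        self.x, self.y = self.x[-self.capacity:], self.y[-self.capacity:]
    def sample(self, n):
        idx = torch.randint(0, len(self.x), (n,))
        return (torch.stack([self.x[i] for i in idx]),
                torch.stack([self.y[i] for i in idx]))

def rar_step(model, opt, loss_fn, batch_x, batch_y, memory, n_reps=3, augment=None):
    """One online step: joint batch replayed n_reps times, re-augmented each time."""
    mem_x, mem_y = memory.sample(len(batch_x))       # rehearsal draw
    x, y = torch.cat([batch_x, mem_x]), torch.cat([batch_y, mem_y])
    for _ in range(n_reps):                          # repeated rehearsal
        x_aug = augment(x) if augment else x         # fresh augmentation per pass
        opt.zero_grad()
        loss_fn(model(x_aug), y).backward()
        opt.step()
    memory.update(batch_x, batch_y)                  # single-pass stream: store once
```

The paper's RL extension would, in this picture, adjust n_reps and the augmentation strength online to balance stability against plasticity.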
In this paper, we propose to adopt the MDE paradigm for the development of machine learning (ML)-enabled software systems, with a focus on the Internet of Things (IoT) domain. We illustrate how two state-of-the-art open-source modeling tools, namely MontiAnna and ML-Quadrat, can be used for this purpose, as demonstrated through a case study. The case study illustrates using ML, in particular deep artificial neural networks (ANNs), for the automated image recognition of handwritten digits on the MNIST reference dataset, and integrating the ML component into an IoT system. Subsequently, we conduct a functional comparison of the two frameworks, setting out an analysis basis that covers a broad range of design considerations, such as the problem domain, the method of integrating ML into larger systems, and the supported ML methods, as well as topics of recent strong interest in the ML community, such as AutoML and MLOps. Accordingly, the focus of this paper is on elucidating the potential of MDE approaches in the ML domain. This supports ML engineers in developing (ML/software) models rather than implementing code, and enables reproducibility and modularity of the design through the off-the-shelf integration of ML functionality as components of IoT or cyber-physical systems.
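For orientation, the ML component of such a case study corresponds at code level to a small MNIST classifier like the one below; this is a generic Keras sketch with illustrative layer sizes, not the output of either tool, and in the MDE workflow this code would be generated from a model rather than written by hand.

```python
import tensorflow as tf

# Load and normalize the MNIST reference dataset.
(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
x_tr, x_te = x_tr / 255.0, x_te / 255.0

# A small deep ANN for handwritten-digit recognition.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_tr, y_tr, epochs=3, validation_data=(x_te, y_te))
```

The point of the MDE approach is that an engineer specifies this component at the model level and the toolchain takes care of code generation and its integration into the surrounding IoT system.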
The wide variety of in-distribution and out-of-distribution data in medical imaging makes universal anomaly detection a challenging task. Recently, a number of self-supervised methods have been developed that train end-to-end models on healthy data augmented with synthetic anomalies. However, it is difficult to compare these methods, as it is unclear whether gains in performance come from the task itself or from the training pipeline around it. It is also difficult to assess whether a task generalizes well to universal anomaly detection, since methods are typically only tested on a limited range of anomalies. To assist with this, we have developed nnOOD, a framework that adapts nnU-Net to allow the comparison of self-supervised anomaly localization methods. By isolating the synthetic, self-supervised task from the rest of the training process, we perform a more faithful comparison of the tasks, while also making the workflow for evaluating on a given dataset quick and easy. Using this, we implement the current state-of-the-art tasks and evaluate them on a challenging X-ray dataset.
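The kind of synthetic, self-supervised task such a framework isolates can be sketched as a patch-blending augmentation: a patch from one healthy image is blended into another, and the blending mask becomes the pixel-level training target for the localization model. Patch size, placement, and blending factor below are illustrative placeholders.

```python
import numpy as np

def synth_anomaly(img_a, img_b, patch=16, alpha=0.5, rng=None):
    """Blend a random patch of img_b into img_a; return the augmented
    image and the mask that serves as the self-supervision target."""
    rng = rng or np.random.default_rng()
    h, w = img_a.shape
    y0 = rng.integers(0, h - patch)
    x0 = rng.integers(0, w - patch)
    out = img_a.copy()
    out[y0:y0+patch, x0:x0+patch] = (
        (1 - alpha) * img_a[y0:y0+patch, x0:x0+patch]
        + alpha * img_b[y0:y0+patch, x0:x0+patch]
    )
    mask = np.zeros_like(img_a)
    mask[y0:y0+patch, x0:x0+patch] = alpha   # pixel-level training target
    return out, mask

# Two stand-in "healthy" images; real inputs would come from the dataset.
healthy_a, healthy_b = np.random.rand(64, 64), np.random.rand(64, 64)
x_aug, target = synth_anomaly(healthy_a, healthy_b, rng=np.random.default_rng(0))
```

Keeping this task a swappable module, while the surrounding training pipeline stays fixed, is what enables the faithful task-versus-task comparison described above.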